knitr::opts_chunk$set(echo = FALSE,
                      message = FALSE,
                      warning = FALSE)

library(startR)
library(here)
## here() starts at /Users/margaretwilson/github/scarid-behavior
library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.2.5
## ✔ tibble  1.4.2     ✔ dplyr   0.7.8
## ✔ tidyr   0.8.2     ✔ stringr 1.3.1
## ✔ readr   1.2.1     ✔ forcats 0.3.0
## ── Conflicts ─────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(mgcv)
## Loading required package: nlme
## 
## Attaching package: 'nlme'
## The following object is masked from 'package:dplyr':
## 
##     collapse
## This is mgcv 1.8-24. For overview type 'help("mgcv-package")'.
library(car)
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## The following object is masked from 'package:purrr':
## 
##     some
library(ggfortify)
library(xtable)
library(itsadug)
## Loading required package: plotfunctions
## 
## Attaching package: 'plotfunctions'
## The following object is masked from 'package:ggplot2':
## 
##     alpha
## Loaded package itsadug 2.3 (see 'help("itsadug")' ).
sum_id <- read.csv(here("data","sum_id.csv"), stringsAsFactors=F) %>%
  select(-X)
sum_site <- read.csv(here("data","sum_site.csv"), stringsAsFactors=F) %>%
  select(-X)
sum_ssp <- read.csv(here("data","sum_site_sp_ph.csv"), stringsAsFactors=F) %>%
  select(-X)

1 Examining predictor variables

Notes: in models using benthic-only PCA, scarid biomass and scarid density are highly collinear with benthic PCA variables (understandably). I don’t think I can justify separating them, even though they are my ultimate variables of interest. I can use PCA loadings to interpret model outcomes. Carnivore biomass and spearfishing presence is also highly correlated with scarid biomass and thus benthic PCA variables - I am excluding it for now.

##   species     phase length_cm   scar_bm  pc1_bent  pc2_bent 
##  1.044714  1.769749  1.972271  2.975964  3.013099  1.025941
##   species     phase length_cm   scar_bm  pc1_bent  pc2_bent 
##  1.013643  1.751144  1.759116  1.092945  1.776875  1.673291

Confirming I can’t use scarid biomass and/or density as separate from benthic PCA variables. VIF of scar_bm is just slightly >3. Removing rugosity from benthic PCA reduces scar_bm ~ pc1 correlation slightly, but still >0.7.

BUT: VIF done on LME where random=site or random=island shows values <3 when using benthic PCA values + scarid biomass… stay tuned…

2 Examining potential response variables

Feeding rate (fr) is primary variable of interest

3 Model as LME first, check model validity

I won’t get too far into dealing with heteroskedasticity here, because it may be improved when switching to GAMM

## Linear mixed-effects model fit by REML
##  Data: m_data 
##        AIC      BIC    logLik
##   9285.171 9324.936 -4633.586
## 
## Random effects:
##  Formula: ~1 | island
##         (Intercept) Residual
## StdDev:    333.1291 440.0516
## 
## Fixed effects: fr ~ species + phase + length_cm + scar_bm + pc1_bent + pc2_bent 
##                             Value Std.Error  DF    t-value p-value
## (Intercept)             1367.3937 214.03527 611   6.388637  0.0000
## speciesSparisoma viride -609.9702  36.33018 611 -16.789629  0.0000
## phaset                  -161.2809  49.10680 611  -3.284289  0.0011
## length_cm                 -8.2886   3.45270 611  -2.400608  0.0167
## scar_bm                   -0.0579   0.02100 611  -2.754637  0.0061
## pc1_bent                 -16.6984  46.28563 611  -0.360768  0.7184
## pc2_bent                 -13.1402  37.88623 611  -0.346832  0.7288
##  Correlation: 
##                         (Intr) spcsSv phaset lngth_ scr_bm pc1_bn
## speciesSparisoma viride -0.125                                   
## phaset                   0.213 -0.068                            
## length_cm               -0.374  0.090 -0.649                     
## scar_bm                 -0.132 -0.069 -0.020 -0.057              
## pc1_bent                 0.025 -0.053  0.041 -0.003  0.263       
## pc2_bent                 0.070 -0.042  0.047 -0.033  0.114  0.627
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -2.849038717 -0.572887245 -0.003603645  0.553787264  3.029430226 
## 
## Number of Observations: 620
## Number of Groups: 3
##   species     phase length_cm   scar_bm  pc1_bent  pc2_bent 
##  1.014019  1.749045  1.757738  1.095536  1.766029  1.659981

Dealing with heteroskedasticity: clear heteroskedasticity in my data, likely with multiple sources…
- differences in variance between species, but modelling species independently still does not fully account for heteroskedasticity (below) - scar_cm and pc2_bent have unevenly distributed intervals - scale?

Options/notes from Zuur et al. 2009 (Ch.4):
- add variance covariate (p.78) - log transform DV (p.131)

Dealing with species separately:

## Linear mixed-effects model fit by REML
##  Data: qup_data 
##        AIC      BIC    logLik
##   4375.734 4404.755 -2179.867
## 
## Random effects:
##  Formula: ~1 | island
##         (Intercept) Residual
## StdDev:    523.5168 558.2676
## 
## Fixed effects: fr ~ phase + length_cm + scar_bm + pc1_bent + pc2_bent 
##                 Value Std.Error  DF   t-value p-value
## (Intercept) 1304.6509  353.3326 276  3.692416  0.0003
## phaset      -338.3987  104.0188 276 -3.253244  0.0013
## length_cm     -7.9850    6.8096 276 -1.172614  0.2420
## scar_bm       -0.0701    0.0390 276 -1.799500  0.0730
## pc1_bent     -51.5619   81.1752 276 -0.635192  0.5258
## pc2_bent      -0.8795   67.7295 276 -0.012986  0.9896
##  Correlation: 
##           (Intr) phaset lngth_ scr_bm pc1_bn
## phaset     0.307                            
## length_cm -0.455 -0.740                     
## scar_bm   -0.195 -0.045  0.013              
## pc1_bent   0.054  0.072 -0.049  0.199       
## pc2_bent   0.127  0.101 -0.117 -0.020  0.514
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -2.56497959 -0.67826396 -0.01176862  0.65662104  3.97128286 
## 
## Number of Observations: 284
## Number of Groups: 3
##     phase length_cm   scar_bm  pc1_bent  pc2_bent 
##  2.227387  2.228749  1.068795  1.452587  1.405196

## Linear mixed-effects model fit by REML
##  Data: vir_data 
##        AIC      BIC    logLik
##   4707.944 4738.361 -2345.972
## 
## Random effects:
##  Formula: ~1 | island
##         (Intercept) Residual
## StdDev:    279.6194 265.9142
## 
## Fixed effects: fr ~ phase + length_cm + scar_bm + pc1_bent + pc2_bent 
##                Value Std.Error  DF   t-value p-value
## (Intercept) 656.6577 176.99082 329  3.710123  0.0002
## phaset      -79.3425  37.22889 329 -2.131207  0.0338
## length_cm    -5.0966   2.82846 329 -1.801883  0.0725
## scar_bm      -0.0248   0.01788 329 -1.386816  0.1664
## pc1_bent     32.6663  40.36506 329  0.809271  0.4189
## pc2_bent     38.8122  33.30933 329  1.165205  0.2448
##  Correlation: 
##           (Intr) phaset lngth_ scr_bm pc1_bn
## phaset     0.165                            
## length_cm -0.354 -0.563                     
## scar_bm   -0.124 -0.020 -0.094              
## pc1_bent  -0.010  0.018  0.047  0.330       
## pc2_bent   0.029  0.009  0.027  0.227  0.699
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -2.22011301 -0.68846461 -0.02581793  0.65749126  2.54668114 
## 
## Number of Observations: 337
## Number of Groups: 3
##     phase length_cm   scar_bm  pc1_bent  pc2_bent 
##  1.486502  1.508646  1.152854  2.105733  1.958335

Log transforming feeding rate variable - still heteroskedasticity, largely from observations with fr=0?

## Linear mixed-effects model fit by REML
##  Data: m_data_tr 
##        AIC      BIC    logLik
##   358.9782 398.7435 -170.4891
## 
## Random effects:
##  Formula: ~1 | island
##         (Intercept)  Residual
## StdDev:   0.2488745 0.3029778
## 
## Fixed effects: log10(fr/100 + 1) ~ species + phase + length_cm + scar_bm + pc1_bent +      pc2_bent 
##                              Value  Std.Error  DF    t-value p-value
## (Intercept)              1.1060766 0.15757631 611   7.019308  0.0000
## speciesSparisoma viride -0.2852648 0.02501379 611 -11.404302  0.0000
## phaset                  -0.0762528 0.03381223 611  -2.255183  0.0245
## length_cm               -0.0035484 0.00237736 611  -1.492595  0.1361
## scar_bm                 -0.0000528 0.00001446 611  -3.648153  0.0003
## pc1_bent                -0.0041406 0.03217213 611  -0.128703  0.8976
## pc2_bent                -0.0138075 0.02624952 611  -0.526011  0.5991
##  Correlation: 
##                         (Intr) spcsSv phaset lngth_ scr_bm pc1_bn
## speciesSparisoma viride -0.117                                   
## phaset                   0.199 -0.068                            
## length_cm               -0.350  0.090 -0.649                     
## scar_bm                 -0.124 -0.069 -0.020 -0.057              
## pc1_bent                 0.024 -0.052  0.042 -0.003  0.260       
## pc2_bent                 0.066 -0.042  0.048 -0.033  0.114  0.632
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.3446923 -0.4195801  0.2018502  0.6814936  1.9806528 
## 
## Number of Observations: 620
## Number of Groups: 3
##   species     phase length_cm   scar_bm  pc1_bent  pc2_bent 
##  1.014000  1.749203  1.757601  1.093573  1.779181  1.675194

Because of 0 values,

4 Switching to gamm, checking model validity

Notes:
- GLS: option for model when homogeneity of variances is not satisfied, can add weights for residuals. Keep reading Zuur et al. and decide if I need to go beyond this (e.g. GAMM)

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr ~ factor(species) + s(length_cm, bs = "cs") + s(scar_bm, bs = "cs") + 
##     s(pc1_bent, bs = "cs") + s(pc2_bent, bs = "cs")
## 
## Parametric coefficients:
##                                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                      1084.34      25.95   41.79   <2e-16 ***
## factor(species)Sparisoma viride  -617.91      35.69  -17.31   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                    edf Ref.df     F  p-value    
## s(length_cm) 1.384e+00      9 4.597 1.60e-11 ***
## s(scar_bm)   2.853e+00      9 1.543  0.00013 ***
## s(pc1_bent)  5.917e+00      9 7.417 1.58e-14 ***
## s(pc2_bent)  6.343e-09      9 0.000  0.50327    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =    0.5   Deviance explained = 50.9%
## GCV = 1.8771e+05  Scale est. = 1.8403e+05  n = 620

## 
## Method: GCV   Optimizer: magic
## Smoothing parameter selection converged after 18 iterations.
## The RMS GCV score gradient at convergence was 0.008373249 .
## The Hessian was positive definite.
## Model rank =  38 / 38 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                    k'      edf k-index p-value
## s(length_cm) 9.00e+00 1.38e+00    1.05    0.86
## s(scar_bm)   9.00e+00 2.85e+00    0.98    0.28
## s(pc1_bent)  9.00e+00 5.92e+00    0.98    0.35
## s(pc2_bent)  9.00e+00 6.34e-09    0.98    0.33

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr ~ factor(species_code) + s(length_cm) + s(scar_bm) + s(pc1_bent, 
##     bs = "cs") + s(pc2_bent, bs = "cs") + s(length_cm, by = as.numeric(species_code == 
##     "qup"))
## 
## Parametric coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                513.53      43.20  11.886   <2e-16 ***
## factor(species_code)stop   -44.49      44.37  -1.003    0.316    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                                                      edf Ref.df      F
## s(length_cm)                                   1.796e+00  2.261  9.489
## s(scar_bm)                                     3.103e+00  3.456  2.545
## s(pc1_bent)                                    5.701e+00  9.000  7.344
## s(pc2_bent)                                    1.188e-09  9.000  0.000
## s(length_cm):as.numeric(species_code == "qup") 8.477e+00  8.717 76.332
##                                                 p-value    
## s(length_cm)                                   4.26e-05 ***
## s(scar_bm)                                       0.0309 *  
## s(pc1_bent)                                    9.01e-15 ***
## s(pc2_bent)                                      0.6341    
## s(length_cm):as.numeric(species_code == "qup")  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Rank: 46/48
## R-sq.(adj) =  0.518   Deviance explained = 53.3%
## GCV = 1.8352e+05  Scale est. = 1.7751e+05  n = 620

## 
## Method: GCV   Optimizer: magic
## Smoothing parameter selection converged after 22 iterations.
## The RMS GCV score gradient at convergence was 0.01293502 .
## The Hessian was positive definite.
## Model rank =  46 / 48 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                                                      k'      edf k-index
## s(length_cm)                                   9.00e+00 1.80e+00    1.07
## s(scar_bm)                                     9.00e+00 3.10e+00    0.96
## s(pc1_bent)                                    9.00e+00 5.70e+00    0.97
## s(pc2_bent)                                    9.00e+00 1.19e-09    0.97
## s(length_cm):as.numeric(species_code == "qup") 1.00e+01 8.48e+00    1.07
##                                                p-value
## s(length_cm)                                      0.93
## s(scar_bm)                                        0.19
## s(pc1_bent)                                       0.23
## s(pc2_bent)                                       0.19
## s(length_cm):as.numeric(species_code == "qup")    0.94
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr ~ factor(species_code) + s(length_cm) + s(scar_bm) + s(pc1_bent) + 
##     s(pc2_bent) + s(pc1_bent, by = as.numeric(species_code == 
##     "qup"))
## 
## Parametric coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                1326.5      384.4   3.451 0.000597 ***
## factor(species_code)stop   -876.3      384.7  -2.278 0.023090 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                                                 edf Ref.df      F  p-value
## s(length_cm)                                  1.000  1.000 52.482 1.25e-12
## s(scar_bm)                                    5.646  6.459  3.128  0.00404
## s(pc1_bent)                                   1.000  1.000  3.768  0.05269
## s(pc2_bent)                                   1.000  1.000  0.167  0.68265
## s(pc1_bent):as.numeric(species_code == "qup") 9.097  9.492 60.195  < 2e-16
##                                                  
## s(length_cm)                                  ***
## s(scar_bm)                                    ** 
## s(pc1_bent)                                   .  
## s(pc2_bent)                                      
## s(pc1_bent):as.numeric(species_code == "qup") ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Rank: 47/48
## R-sq.(adj) =  0.542   Deviance explained = 55.5%
## GCV = 1.7409e+05  Scale est. = 1.6873e+05  n = 620

## 
## Method: GCV   Optimizer: magic
## Smoothing parameter selection converged after 26 iterations.
## The RMS GCV score gradient at convergence was 0.004995864 .
## The Hessian was positive definite.
## Model rank =  47 / 48 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                                                  k'   edf k-index p-value
## s(length_cm)                                   9.00  1.00    1.03   0.795
## s(scar_bm)                                     9.00  5.65    0.93   0.065
## s(pc1_bent)                                    9.00  1.00    0.94   0.045
## s(pc2_bent)                                    9.00  1.00    0.93   0.035
## s(pc1_bent):as.numeric(species_code == "qup") 10.00  9.10    0.94   0.080
##                                                
## s(length_cm)                                   
## s(scar_bm)                                    .
## s(pc1_bent)                                   *
## s(pc2_bent)                                   *
## s(pc1_bent):as.numeric(species_code == "qup") .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr ~ factor(species_code) + s(length_cm) + s(scar_bm) + s(pc1_bent) + 
##     s(pc2_bent) + s(pc2_bent, by = as.numeric(species_code == 
##     "qup"))
## 
## Parametric coefficients:
##                          Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                -329.3      376.8  -0.874   0.3825  
## factor(species_code)stop    781.5      376.9   2.074   0.0385 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                                                 edf Ref.df      F  p-value
## s(length_cm)                                  1.000  1.000 51.811 1.71e-12
## s(scar_bm)                                    4.024  4.403  1.242  0.16901
## s(pc1_bent)                                   3.279  3.697  5.340  0.00151
## s(pc2_bent)                                   1.891  2.147  1.932  0.13791
## s(pc2_bent):as.numeric(species_code == "qup") 7.512  8.156  6.960 7.02e-09
##                                                  
## s(length_cm)                                  ***
## s(scar_bm)                                       
## s(pc1_bent)                                   ** 
## s(pc2_bent)                                      
## s(pc2_bent):as.numeric(species_code == "qup") ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Rank: 47/48
## R-sq.(adj) =  0.534   Deviance explained = 54.8%
## GCV = 1.769e+05  Scale est. = 1.7147e+05  n = 620

## 
## Method: GCV   Optimizer: magic
## Smoothing parameter selection converged after 16 iterations.
## The RMS GCV score gradient at convergence was 0.005074089 .
## The Hessian was positive definite.
## Model rank =  47 / 48 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                                                  k'   edf k-index p-value
## s(length_cm)                                   9.00  1.00    1.06   0.925
## s(scar_bm)                                     9.00  4.02    0.94   0.045
## s(pc1_bent)                                    9.00  3.28    0.95   0.060
## s(pc2_bent)                                    9.00  1.89    0.94   0.055
## s(pc2_bent):as.numeric(species_code == "qup") 10.00  7.51    0.94   0.060
##                                                
## s(length_cm)                                   
## s(scar_bm)                                    *
## s(pc1_bent)                                   .
## s(pc2_bent)                                   .
## s(pc2_bent):as.numeric(species_code == "qup") .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr ~ factor(species_code) + s(length_cm) + s(scar_bm) + s(pc1_bent) + 
##     s(pc2_bent) + s(pc1_bent, by = as.numeric(species_code == 
##     "qup")) + s(pc2_bent, by = as.numeric(species_code == "qup")) + 
##     s(length_cm, by = as.numeric(species_code == "qup")) + s(scar_bm, 
##     by = as.numeric(species_code == "qup"))
## 
## Parametric coefficients:
##                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)                 240.5      175.1   1.374     0.17
## factor(species_code)stop    215.4      175.3   1.229     0.22
## 
## Approximate significance of smooth terms:
##                                                  edf Ref.df     F  p-value
## s(length_cm)                                   2.041  2.575 4.483 0.005652
## s(scar_bm)                                     5.427  6.169 3.005 0.005511
## s(pc1_bent)                                    1.000  1.000 4.875 0.027628
## s(pc2_bent)                                    1.170  1.279 0.501 0.629659
## s(pc1_bent):as.numeric(species_code == "qup")  4.206  4.612 5.055 0.000344
## s(pc2_bent):as.numeric(species_code == "qup")  1.222  1.222 2.280 0.133075
## s(length_cm):as.numeric(species_code == "qup") 7.670  8.554 4.614 2.14e-05
## s(scar_bm):as.numeric(species_code == "qup")   3.798  4.216 0.989 0.477322
##                                                   
## s(length_cm)                                   ** 
## s(scar_bm)                                     ** 
## s(pc1_bent)                                    *  
## s(pc2_bent)                                       
## s(pc1_bent):as.numeric(species_code == "qup")  ***
## s(pc2_bent):as.numeric(species_code == "qup")     
## s(length_cm):as.numeric(species_code == "qup") ***
## s(scar_bm):as.numeric(species_code == "qup")      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Rank: 74/78
## R-sq.(adj) =  0.571   Deviance explained = 58.9%
## GCV = 1.653e+05  Scale est. = 1.5793e+05  n = 620

## 
## Method: GCV   Optimizer: magic
## Smoothing parameter selection converged after 16 iterations.
## The RMS GCV score gradient at convergence was 0.009712756 .
## The Hessian was positive definite.
## Model rank =  74 / 78 
## 
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
## 
##                                                   k'   edf k-index p-value
## s(length_cm)                                    9.00  2.04    1.07   0.940
## s(scar_bm)                                      9.00  5.43    0.91   0.005
## s(pc1_bent)                                     9.00  1.00    0.92   0.015
## s(pc2_bent)                                     9.00  1.17    0.91  <2e-16
## s(pc1_bent):as.numeric(species_code == "qup")  10.00  4.21    0.92   0.020
## s(pc2_bent):as.numeric(species_code == "qup")  10.00  1.22    0.91   0.005
## s(length_cm):as.numeric(species_code == "qup") 10.00  7.67    1.07   0.920
## s(scar_bm):as.numeric(species_code == "qup")   10.00  3.80    0.91   0.015
##                                                   
## s(length_cm)                                      
## s(scar_bm)                                     ** 
## s(pc1_bent)                                    *  
## s(pc2_bent)                                    ***
## s(pc1_bent):as.numeric(species_code == "qup")  *  
## s(pc2_bent):as.numeric(species_code == "qup")  ** 
## s(length_cm):as.numeric(species_code == "qup")    
## s(scar_bm):as.numeric(species_code == "qup")   *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Model validity notes:
- Violates assumption of homogeneity of variance (increasing variance as predictor values increase) - Satisfies assumption of normality

need to add interaction terms

Re: Nash et al. 2016: should I be aggregating data to the site level?

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr_mean ~ species_code + phase + s(length_m) + s(pc1_bent) + 
##     s(pc2_bent) + s(scar_bm)
## 
## Parametric coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        821.45      57.24  14.350  < 2e-16 ***
## species_coderbp   -331.56     111.44  -2.975  0.00428 ** 
## species_codestop  -418.31      70.30  -5.950 1.72e-07 ***
## phaset             -89.36      89.78  -0.995  0.32376    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value    
## s(length_m) 1.496  1.496  1.292    0.425    
## s(pc1_bent) 1.000  1.000 19.514 4.19e-05 ***
## s(pc2_bent) 1.944  1.944  0.726    0.423    
## s(scar_bm)  2.366  2.366  1.518    0.195    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.609   
##   Scale est. = 54592     n = 68

## [1] 883.9623
##    phase length_m  scar_bm pc1_bent pc2_bent 
## 1.597270 2.090804 2.710746 2.493320 1.061033

Site-level GAMM with pca_all

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr_mean ~ species_code + s(length_m) + phase + s(pc1_all) + s(pc2_all)
## 
## Parametric coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        899.30      83.21  10.807 1.73e-13 ***
## species_codestop  -402.57      85.38  -4.715 2.87e-05 ***
## phaset            -207.48     141.06  -1.471    0.149    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value    
## s(length_m) 1.440  1.440  0.180 0.755593    
## s(pc1_all)  2.152  2.152 10.297 0.000167 ***
## s(pc2_all)  1.000  1.000  0.441 0.510297    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.567   
##   Scale est. = 78385     n = 48

## [1] 635.9635
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr_mean ~ species_code + phase + s(length_m, by = factor(species_code)) + 
##     s(pc2_all) + s(pc1_all, by = factor(species_code))
## 
## Parametric coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       1012.28     167.69   6.037 6.41e-07 ***
## species_codestop  -411.27      60.92  -6.751 7.29e-08 ***
## phaset            -319.03     113.20  -2.818  0.00782 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                                        edf Ref.df     F p-value  
## s(length_m):factor(species_code)qup  3.782  3.782 2.817  0.0247 *
## s(length_m):factor(species_code)stop 1.000  1.000 3.284  0.0782 .
## s(pc2_all)                           1.000  1.000 1.025  0.3181  
## s(pc1_all):factor(species_code)qup   2.526  2.526 3.057  0.0413 *
## s(pc1_all):factor(species_code)stop  1.000  1.000 0.610  0.4398  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.526   
##   Scale est. = 37972     n = 48

## [1] 601.1334
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr_mean ~ species_code + phase + s(pc1_all) + s(pc2_all)
## 
## Parametric coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        891.73      69.86  12.764 5.16e-16 ***
## species_codestop  -392.54      81.17  -4.836 1.83e-05 ***
## phaset            -203.02      80.76  -2.514   0.0159 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##              edf Ref.df      F  p-value    
## s(pc1_all) 2.186  2.186 13.680 1.25e-05 ***
## s(pc2_all) 1.000  1.000  0.422    0.519    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.571   
##   Scale est. = 77691     n = 48

## [1] 642.6481
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## fr_mean ~ species_code + phase + s(pc1_all, by = factor(species_code))
## 
## Parametric coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        885.64      58.59  15.117  < 2e-16 ***
## species_codestop  -388.80      68.04  -5.714 1.07e-06 ***
## phaset            -218.31      67.81  -3.219   0.0025 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                                       edf Ref.df     F  p-value    
## s(pc1_all):factor(species_code)qup  2.605  2.605 22.23 8.28e-09 ***
## s(pc1_all):factor(species_code)stop 1.000  1.000  3.45   0.0703 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.698   
##   Scale est. = 54589     n = 48

## [1] 627.1399
## <!-- html table generated in R 3.5.1 by xtable 1.8-3 package -->
## <!-- Tue Nov 27 13:44:52 2018 -->
## <table border=1>
## <caption align="bottom">   </caption>
##   <tr> <td> A. parametric coefficients </td> <td align="right"> Estimate </td> <td align="right"> Std. Error </td> <td align="right"> t-value </td> <td align="right"> p-value </td> </tr>
##   <tr> <td> (Intercept) </td> <td align="right"> 885.6387 </td> <td align="right"> 58.5860 </td> <td align="right"> 15.1169 </td> <td align="right"> &lt; 0.0001 </td> </tr>
##   <tr> <td> species_codestop </td> <td align="right"> -388.8038 </td> <td align="right"> 68.0447 </td> <td align="right"> -5.7140 </td> <td align="right"> &lt; 0.0001 </td> </tr>
##   <tr> <td> phaset </td> <td align="right"> -218.3100 </td> <td align="right"> 67.8146 </td> <td align="right"> -3.2192 </td> <td align="right"> 0.0025 </td> </tr>
##    <tr> <td> B. smooth terms </td> <td align="right"> edf </td> <td align="right"> Ref.df </td> <td align="right"> F-value </td> <td align="right"> p-value </td> </tr>
##   <tr> <td> s(pc1_all):factor(species_code)qup </td> <td align="right"> 2.6050 </td> <td align="right"> 2.6050 </td> <td align="right"> 22.2338 </td> <td align="right"> &lt; 0.0001 </td> </tr>
##   <tr> <td> s(pc1_all):factor(species_code)stop </td> <td align="right"> 1.0000 </td> <td align="right"> 1.0000 </td> <td align="right"> 3.4499 </td> <td align="right"> 0.0703 </td> </tr>
##    <a name=tab.gam></a>
## </table>